Zoklet.net

Old 05-02-2010, 09:38 PM   #1
calatron
serious busniess
 
Join Date: Mar 2009
Location: Hudson Valley/Rochester, NY
Python Craigslist Search Script

http://cal.freeshell.org/2010/04/pyt...search-script/

It's a pretty simple script: it runs your search query against any number of Craigslist areas.

Start the script and enter the cities you want, separated by commas with no spaces:
newyork,albany,buffalo
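
If you'd rather not worry about stray spaces or repeated cities, the input can be cleaned up before it's used. This is only a sketch: parse_cities is a made-up helper name, not something in clget.py, and lowercasing the names is my assumption about how the subdomains are written.

```python
def parse_cities(raw):
    """Split a comma-separated city string into unique subdomain names."""
    seen = []
    for name in raw.split(","):
        # Tolerate spaces around names and skip empty entries like "a,,b".
        name = name.strip().lower()
        if name and name not in seen:
            seen.append(name)  # keeps the order the user typed
    return seen

print(parse_cities(" NewYork, albany,,buffalo,albany"))
```

Unlike deduplicating through a set, this keeps the cities in the order they were entered, so the output file lists them the same way.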

Then enter your query as normal:
dell laptop
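
For anyone curious how a query like that ends up in a request, here's a rough sketch of building the search URL. The URL layout (including the fixed areaID=126) is copied from the script's own request line and may not match what Craigslist serves today. build_search_url is a made-up name, and quote_plus is just one reasonable way to encode the query.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2

def build_search_url(city, query):
    # quote_plus turns spaces into "+" and escapes characters such as "&".
    return ("http://" + city + ".craigslist.org/search/"
            "?areaID=126&subAreaID=&query=" + quote_plus(query) + "&catAbb=sss")

print(build_search_url("albany", "dell laptop"))
```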

It creates an HTML document (results<timestamp>.html, in the current directory) with all the results for that query in those areas.
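
The single-file output works because the file is opened in append mode once per city. A minimal sketch of that idea follows. append_results is a hypothetical helper, and the <h2> heading per city is my own addition (the script writes the bare city name).

```python
import os
import tempfile
import time

def append_results(path, city, fragment):
    # Mode "a" appends, so each city's block lands in the same file.
    with open(path, "a") as out:
        out.write("<h2>" + city + "</h2>\n")  # heading is an assumption, not in the script
        out.write(fragment + "\n")

# Demo: write into a throwaway directory instead of the current one.
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "results" + str(int(time.time())) + ".html")
append_results(path, "albany", "<p>first listing</p>")
append_results(path, "buffalo", "<p>second listing</p>")
with open(path) as f:
    print(f.read())
```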

Download: http://cal.freeshell.org/wp-content/...10/04/clget.py

Code:
import re
import time
import urllib
import urllib2

# Grab everything from a <p> up to the "sort by" footer on a results page.
results = re.compile('<p>.+<div>sort by')
delay = 100  # seconds to wait before the single retry

# One timestamp per run, so every city is appended to the same output file.
tyme = int(time.time())

print "Welcome to CLget!"
print ""

cityIN = raw_input('City or cities separated by commas and no spaces: ')
query = raw_input("Input query: ")

cities = cityIN.split(',')

#Setup headers to spoof Mozilla
ua = "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.1.4) Gecko/20091007 Firefox/3.5.4"
head = {'User-agent': ua}

# set() drops duplicate cities (order is not preserved)
for city in set(cities):
    # Note: areaID=126 is sent for every city.
    # quote_plus encodes spaces as "+" and escapes other special characters.
    url = "http://" + city + ".craigslist.org/search/?areaID=126&subAreaID=&query=" + urllib.quote_plus(query) + "&catAbb=sss"

    #Get page, retrying once after a pause if the request fails
    req = urllib2.Request(url, None, head)
    try:
        response = urllib2.urlopen(req)
    except urllib2.URLError, e:  # HTTPError is a subclass of URLError
        print "Request failed, retrying in " + str(delay) + " seconds"
        time.sleep(delay)
        response = urllib2.urlopen(req)

    msg = response.read()

    # Join the matched fragments and drop the trailing "sort by" label.
    res = ''.join(results.findall(msg))
    res = res.replace('sort by', '')

    outp = open("results" + str(tyme) + ".html", "a")
    outp.write(city)
    outp.write(res)
    outp.close()
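
The retry step in the script (one more attempt after a pause) could also be pulled out into a small helper. This is a sketch, not part of clget.py: fetch_with_retry and its parameters are invented names, and it catches IOError on the assumption that the URL errors raised by urllib2 (or urllib.request on Python 3) derive from it.

```python
import time

def fetch_with_retry(fetch, retries=1, delay=100, sleep=time.sleep):
    """Call fetch(); on IOError wait `delay` seconds and retry up to `retries` times."""
    attempt = 0
    while True:
        try:
            return fetch()
        except IOError as err:
            if attempt >= retries:
                raise  # out of retries, let the caller see the error
            attempt += 1
            print("Request failed (%s), retrying in %d seconds" % (err, delay))
            sleep(delay)
```

In the script this would be called as something like fetch_with_retry(lambda: urllib2.urlopen(req)). Passing sleep in as a parameter just makes the helper easy to try out without actually waiting 100 seconds.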
Tags
craigslist, python, script, search
